ABSTRACT

We study the relation between halo mass and its environment from a probabilistic
perspective. We find that halo mass depends not only on the local dark matter density,
but also on non-local quantities such as the cosmic web environment and the
halo-exclusion effect. Given these accurate relations, we have developed the HADRON-code
(Halo mAss Distribution ReconstructiON), a technique which permits us to assign
halo masses to a distribution of haloes in three-dimensional space. This can be
applied to the fast production of mock galaxy catalogues, by assigning halo masses
and accurately reproducing the bias for different mass cuts. The resulting clustering
of the halo populations agrees well with that drawn from the BigMultiDark N-body
simulation: the power spectra are within 1-$\sigma$ up to scales of
$k = 0.2\,h\,\mathrm{Mpc}^{-1}$ when using augmented Lagrangian perturbation theory
based mock catalogues. Only the most massive haloes show a larger deviation. For
these, we find evidence of the halo-exclusion effect. A clear improvement is achieved
when assigning the highest masses to haloes with a minimum distance separation. We
also compute the 2- and 3-point correlation functions, and find an excellent agreement
with N-body results. Our work represents a quantitative application of the cosmic web
classification. It can have further interesting applications in the multi-tracer
analysis of the large-scale structure for future galaxy surveys.

Key words: (cosmology:) large-scale structure of Universe - galaxies: clusters:
general - catalogues - galaxies: statistics


1 INTRODUCTION 

Hierarchical structure formation in the Cold Dark Matter theory predicts the
production of gravitationally bound compact objects called haloes (White & Rees 1978;
Fry & Peebles 1978). They host the galaxies we observe in our Universe according to
the standard cosmological paradigm. Nevertheless, their biased relationship with
respect to the underlying dark matter distribution still remains a matter of study.
In spite of the great progress made during the last decades, some questions have not
been fully answered yet (for a review, cf. Cooray & Sheth 2002). We certainly know
now that bias is nonlinear and scale-dependent (e.g. Nuza et al. 2013). Recent
studies have shown that non-local (e.g. Saito et al. 2014) and stochastic
contributions (e.g. Kitaura et al. 2014a) are also relevant in the three-point
clustering statistics. In fact, parametrized bias expressions are degenerate in the
two-point clustering statistics (Kitaura et al. 2014b).

A proper bias weighting or mass weighting can reduce the variance of the clustering
measurements (Percival, Verde, & Peacock 2004; Seljak et al. 2009). These methods
can be applied to galaxy surveys (e.g. SDSS-III/BOSS; Eisenstein et al. 2011; Dawson
et al. 2013), for which one can estimate the biases or masses of the galaxy sample.
Future surveys such as DES (Frieman et al. 2013), J-PAS (Benitez et al. 2014), 4MOST
(de Jong et al. 2012) or Euclid (Laureijs et al. 2009) will exploit the multi-tracer
analysis to constrain dark energy, the growth rate of the Universe, and hence gravity
models (e.g. McDonald & Seljak 2009; Blake et al. 2013). In such a multi-tracer
approach, different populations of tracers of the cosmic density field are treated as
independent measurements which, weighted by their distinct biases (cf. McDonald &
Seljak 2009; Seljak et al. 2009; Hamaus et al. 2010), will yield much tighter
cosmological constraints (cf. also Abramo & Leonard 2013). In this context, it is
fundamental to have a deep understanding of the bias for different populations of
tracers.

We aim at answering several questions in this study, such as: how are haloes
distributed in the cosmic web, and which properties determine the bias of different
halo populations? In particular, we investigate in this work the relation between
halo mass and environment. As a practical application, we want to understand how to
statistically assign halo mass to a mixed population of haloes. We will present in a
subsequent publication how to extend this work to galaxy stellar masses (Kitaura et
al., in prep.; Rodriguez-Torres et al., in prep.).

The technique presented in this work has a direct application to the fast generation
of mock galaxy or halo catalogues, and could be applied to the reconstruction of halo
masses and the density field (cf. e.g. Wang et al. 2009; Munoz-Cuartas, Muller, &
Forero-Romero 2011; Wang et al. 2012; Munoz-Cuartas & Muller 2012; Kitaura 2013), and
to the multi-tracer analysis from galaxy redshift surveys.

This paper is structured as follows: in section 2 we first present the theoretical
approach. Then, we show in section 4 our numerical experiments based on the
BigMultiDark (BigMD) N-body simulation (Klypin et al. 2014), which we describe in
section 3, followed by the application to the generation of mock galaxy/halo
catalogues based on perturbation theory in section 5. Finally, we present our
conclusions in section 6.


http://www.sdss3.org/future/
http://desi.lbl.gov/
http://www.darkenergysurvey.org
http://www.lsst.org/lsst/
http://j-pas.org/
http://www.aip.de/en/research/research-area-ea/research-groups-and-projects/4most
http://www.euclid-ec.org
http://www.multidark.org/MultiDark/


2 THEORETICAL APPROACH

The aim of this study is to examine the properties of the large-scale structures
which statistically determine the mass of haloes. The starting point is given by the
mass function, which predicts the number of compact objects (haloes) of a certain
mass (cf. pioneering works of Press & Schechter 1974; Bardeen et al. 1986, and the
later seminal works by Mo & White 1996; Sheth & Tormen 2004a). The question is which
additional quantities $\{q\}$ have a significant impact on the mass of haloes from a
statistical perspective. In particular, we want to answer: what determines the
conditional probability distribution function of the halo mass $M_h^i$ of an object
$i$ at position $r_h^i$, given a distribution of haloes in three-dimensional space
$\{r_h\}$, at redshift $z$ and with cosmological parameters $\{p_c\}$, i.e.

$$M_h^i \sim P(M_h(r_h^i) \mid \{r_h\}, \{q\}, \{p_c\}, z). \qquad (1)$$


The accuracy of this mass assignment will have an impact on various statistical
measures, such as the two- and the three-point clustering statistics. Hence, it
controls the bias, which is the ultimate goal of this work. Here, we follow a
hierarchical approach in which we include increasingly more information in the
conditional probability distribution function, evaluating at each stage the precision
of the resulting bias. From theoretical considerations based on the literature, we
need to examine the impact of nonlinear local, non-local, and stochastic components
of the bias (e.g. Press & Schechter 1974; Peacock & Heavens 1985; Bardeen et al.
1986; Fry & Gaztanaga 1993; Mo & White 1996; Pen 1998; Dekel & Lahav 1999; Sheth &
Lemson 1999; Seljak 2000; Berlind & Weinberg 2002; Smith et al. 2007; McDonald & Roy
2009; Desjacques et al. 2010; Beltran Jimenez & Durrer 2011; Valageas & Nishimichi
2011; Elia et al. 2012; Chan et al. 2012; Baldauf et al. 2012, 2013; Angulo et al.).

(i) We start with the simplest assumption based solely on the mass function,
neglecting any other information:
$P(M_h^i \mid \{r_h\}, \{p_c\}, z)$.

(ii) The peak background split picture models the formation of haloes of different
masses in density peaks above corresponding density thresholds (Kaiser 1984; Bardeen
et al. 1986; Sheth & Tormen 2004b). This theory indicates that we should consider as
a next order approximation the dependence on the local density field $\rho_M$:
$P(M_h^i \mid \{r_h\}, \rho_M, \{p_c\}, z)$.

(iii) Recent studies have shown that non-local effects are relevant in the
three-point clustering statistics (e.g. Saito et al. 2014). In particular, we are
interested in investigating the importance of the cosmic environment in which
different haloes reside. We will study this through the eigenvalues of the tidal
field tensor $T$, which is a non-local measure:
$P(M_h^i \mid \{r_h\}, \rho_M, T, \{p_c\}, z)$.

(iv) Finally, we aim at investigating stochastic biasing (e.g. Kitaura et al.
2014a,b). This component encodes in an effective way the non-local bias contributions
(in this case beyond the tidal field tensor). We will consider in particular the
deviation from Poissonity. A larger dispersion than Poisson corresponds to
over-dispersion, a smaller one to under-dispersion, which are due to positive or
negative correlation on sub-grid scales, respectively (e.g. Peebles 1980; Somerville
et al. 2001; Casas-Miranda et al. 2002; Baldauf et al. 2013; Kitaura et al. 2014a).
We will focus in this study on the minimum separation between haloes
$\Delta r_{\min}$:
$P(M_h^i \mid \{r_h\}, \rho_M, T, \Delta r_{\min}, \{p_c\}, z)$.

In the next section we will investigate the relevance of the different bias
components $\{q\} = \{\rho_M, T, \Delta r_{\min}\}$, based on the analysis of a large
N-body cosmological simulation.

Halo mass distribution reconstruction


3 REFERENCE N-BODY SIMULATION AND HALO CATALOGUES


We employ the dark matter particle and halo catalogues at redshift $z = 0.5618$,
extracted from one of the BigMD simulations, which was performed using the TreePM
N-body code GADGET-2 (Springel 2005) with $3840^3$ particles in a volume of
$(2.5\,h^{-1}\,\mathrm{Gpc})^3$, within the framework of Planck $\Lambda$CDM
cosmology with $\{\Omega_m = 0.307115, \Omega_b = 0.048206, \sigma_8 = 0.8288,
n_s = 0.96\}$, and the Hubble parameter
($H_0 \equiv 100\,h\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$) given by $h = 0.6777$.

We have two sets of halo catalogues constructed by using two different halo finders,
the spherical overdensity Bound Density Maxima (BDM) (Klypin & Holtzman 1997;
Gottloeber 1998; Riebe et al. 2011) and the Friends-of-Friends (FoF) (Gottloeber
1998; Riebe et al. 2011) halo finders, the latter with linking length $l = 0.17$
times the mean interparticle distance ($0.11\,h^{-1}\,\mathrm{Mpc}$). We select the
BDM haloes and subhaloes by $V_{\max}$ (i.e., maximum circular velocity), and use
mass for FoF haloes, to construct complete samples from both halo catalogues with
number density $3.5 \times 10^{-4}\,h^3\,\mathrm{Mpc}^{-3}$, similar to that of
typical Luminous Red Galaxies in large-scale surveys.



( 2 ) 


We choose this quantity as a proxy for halo mass, since (cf. Prada et al. 2012) 1) it
is a more reliable quantity than the mass defined at a given over-density, and 2) it
is better for the characterization of haloes when relating them to the galaxies
inside. It has a more direct relation with observational quantities, such as
luminosity or stellar mass, that are used for defining galaxy catalogues with Halo
Abundance Matching (HAM) modelling (Trujillo-Gomez et al. 2011). We will focus in
this work on BDM (sub)haloes and show some results using FoF haloes in Appendix A.


4 THE HALO MASS ENVIRONMENTAL DEPENDENCE

Haloes are generally identified in an N-body dark matter density field using the
so-called halo-finder algorithms. This is essentially an estimate of the halo bias,
which encodes a certain relation between halo masses and the dark matter density
field.

Therefore, let us start by studying the bias from N-body cosmological simulations
that provide both dark matter particles and the corresponding halo catalogues. To
follow the analysis proposed in the previous section, we will investigate the
clustering statistics of different populations of haloes conditioned on different
degrees of information.

Figure 1. Halo mass ($V_{\max}$) functions of the original data drawn from the BigMD
simulation and of the re-assigned halo catalogue. They agree with each other by
construction. The error bars show Poisson errors.

To evaluate the accuracy of the bias, we will start with the two-point statistics in
Fourier space. In particular, we compute power spectra of different populations of
haloes, whose amplitudes are essentially a direct estimate of the bias factor (e.g.
Cen & Ostriker 1992), and whose shapes show the scale-dependency. In this work, we
adopt the cloud-in-cells particle assignment scheme (CIC) for haloes with a grid size
of $512^3$ to perform the fast Fourier transform, and then compute the power spectra
with aliasing and shot noise corrections taken into account (cf. Jing 2005).
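The measurement described above can be illustrated with a minimal sketch. It is not the estimator used in this work: it assumes NumPy, uses nearest-grid-point counts instead of CIC, subtracts only the Poisson shot noise $V/N$, and omits the aliasing correction of Jing (2005). The function names `ngp_overdensity` and `power_spectrum` are hypothetical.

```python
import numpy as np

def ngp_overdensity(pos, box, ngrid):
    """Count points per cell (nearest grid point) and return delta = n/<n> - 1."""
    idx = np.floor(pos / box * ngrid).astype(int) % ngrid
    counts = np.zeros((ngrid, ngrid, ngrid))
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return counts / counts.mean() - 1.0

def power_spectrum(delta, box, n_obj, nbins=8):
    """Spherically averaged P(k) with the Poisson shot noise V/N subtracted."""
    ngrid = delta.shape[0]
    volume = box ** 3
    delta_k = np.fft.fftn(delta)
    # P(k) = V |delta_k|^2 / N_cells^2, minus the shot-noise level V / N
    pk3d = volume * np.abs(delta_k) ** 2 / ngrid ** 6 - volume / n_obj
    kf = 2.0 * np.pi / box                         # fundamental mode of the box
    k1d = kf * np.fft.fftfreq(ngrid, d=1.0 / ngrid)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2).ravel()
    edges = np.linspace(kf, kf * ngrid / 2, nbins + 1)   # up to the Nyquist mode
    which = np.digitize(kmag, edges) - 1
    keep = (which >= 0) & (which < nbins)
    n_modes = np.bincount(which[keep], minlength=nbins)
    p_sum = np.bincount(which[keep], weights=pk3d.ravel()[keep], minlength=nbins)
    return 0.5 * (edges[1:] + edges[:-1]), p_sum / n_modes

# Illustrative check: unclustered (Poisson) points should leave P(k) close to zero.
rng = np.random.default_rng(1)
box, n_obj = 100.0, 20000
points = rng.uniform(0.0, box, size=(n_obj, 3))
k, pk = power_spectrum(ngp_overdensity(points, box, 16), box, n_obj)
```

For a clustered halo sample the same call returns the shot-noise-corrected spectrum, and the ratio between two such spectra is the quantity compared in the figures of this section.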

Our aim is to reduce systematic deviations in the power spectra to a few percent on
scales relevant to baryon acoustic oscillations, i.e.
$k \lesssim 0.2\,h\,\mathrm{Mpc}^{-1}$. We study in this section different mass
assignment procedures, and show how this goal can be achieved, provided we take into
account the density-mass relation, the cosmic web environment, and halo-exclusion.
Let us now perform our numerical analysis, trying to recover the masses of a given
three-dimensional distribution of haloes by going through the steps outlined in §2.


4.1 Mass function 

We start by studying the halo $V_{\max}$ reconstruction considering only the mass
($V_{\max}$) function, i.e. the conditional probability function
$P(M_h^i \mid \{r_h\}, \{p_c\}, z)$. In particular, each halo gets a $V_{\max}$
assigned following the function shown in Fig. 1, regardless of its location. In this
case, the $V_{\max}$ cumulative function is reproduced by construction.
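As a minimal illustration of this first step, one way to realise a location-independent assignment is to permute the catalogue values among the haloes. This is an assumption made for the sketch (the text only states that values follow the measured function), and the helper name is hypothetical.

```python
import random

def reassign_ignoring_environment(vmax, seed=0):
    """Return the catalogue V_max values in random order.

    A permutation keeps the multiset of values, so the cumulative V_max
    function is reproduced exactly while all environmental information is lost.
    """
    rng = random.Random(seed)
    shuffled = list(vmax)
    rng.shuffle(shuffled)
    return shuffled

# Illustrative values in km/s, not taken from the simulation.
original = [395.0, 412.0, 430.0, 455.0, 480.0, 520.0, 600.0, 710.0]
reassigned = reassign_ignoring_environment(original)
```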

To verify the performance of the halo $V_{\max}$ reconstruction, we divide both the
original BDM halo catalogue and the re-assigned one into 8 sub-samples of halo
$V_{\max}$ and compare the clustering for each sub-sample. In particular, we cut the
catalogues in $V_{\max}$ bins of <410, [410, 427), [427, 447), [447, 472),
[472, 501), [501, 550), [550, 640) and $\geq 640$ km s$^{-1}$ to have a similar
number of haloes in each bin. We can see in Fig. 2 how the power spectra of the
different sub-samples have amplitudes which disagree with the true ones. The degree
of the systematic deviation depends on the mass of the halo population. Low-mass
haloes ($V_{\max} \lesssim 500$ km s$^{-1}$) yield an overestimation of the bias (the
ratio of the reconstructed and the true power spectrum is greater than one). Only
haloes in the range around $V_{\max} \sim 500 - 550$ km s$^{-1}$ are nearly unbiased,
and more massive haloes lead to an underestimation of the bias (the ratio of the
reconstructed and the true power spectrum is lower than one). Nevertheless, the
deviations of up to ~40% throughout the full $k$-range in the power spectra hint that
we need to consider additional environmental indicators to make a precise mass
assignment.

Figure 2. Power spectra of the halo sub-samples discussed in the text, with different
halo $V_{\max}$. Dashed lines indicate the power spectra for the sub-samples drawn
from the BigMD BDM catalogue, while solid lines correspond to those after applying
the mass assignment procedure. MF stands for the mass ($V_{\max}$) function used in
this case, see Fig. 1. The different colour codes correspond to the different
$V_{\max}$ bins of the sub-samples, increasing from the bottom to the top lines
(<410, [410, 427), [427, 447), [447, 472), [472, 501), [501, 550), [550, 640) and
$\geq 640$ km s$^{-1}$). For visualisation purposes the power spectra corresponding
to different $V_{\max}$ bins have been multiplied by different constants in the upper
panel to enlarge the differences.

Figure 3. Number of (sub)haloes in the BigMD simulation with certain $V_{\max}$ and
local DM density.

4.2 Density—halo mass relation 

Let us now investigate the relation between the halo mass (or equivalently
$V_{\max}$) and the local dark matter density.

Figure 4. Power spectra of the sub-samples with different halo $V_{\max}$. Same
convention as in Fig. 2. VD stands for the $V_{\max}$-density relation used in this
case.

We employ the CIC scheme for dark matter particles in the BigMD simulation with a
grid size of $960^3$ to obtain the dark matter density field, and then distribute
haloes to the same mesh using the nearest-grid-point scheme (NGP). We adopt this
combination because the mocks in our application employ CIC when computing the dark
matter density field, and because we need to use integers for the mass assignment
procedure. For each halo, the local dark matter density is defined by the number
density contrast of dark matter particles in the corresponding cell, i.e.,
$1 + \delta_{\rm dm} = \rho/\bar{\rho}$, with $\delta_{\rm dm}$ being the density
fluctuations, $\rho$ the density, and $\bar{\rho}$ the mean density.
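The two assignment schemes named above can be sketched as follows. This is an illustrative implementation under stated assumptions (NumPy, a periodic cubic box, cell centres at half-integer grid coordinates); the function names `cic_density` and `ngp_lookup` are hypothetical and the grid is far smaller than the $960^3$ mesh of the text.

```python
import numpy as np

def cic_density(pos, box, ngrid):
    """Return 1 + delta on a periodic ngrid^3 mesh using cloud-in-cell weights."""
    mesh = np.zeros((ngrid, ngrid, ngrid))
    s = pos / box * ngrid - 0.5          # shift so cell centres sit at integers
    i0 = np.floor(s).astype(int)         # lower neighbouring cell
    w1 = s - i0                          # weight given to the upper neighbour
    w0 = 1.0 - w1                        # weight given to the lower neighbour
    for dx, wx in ((0, w0[:, 0]), (1, w1[:, 0])):
        for dy, wy in ((0, w0[:, 1]), (1, w1[:, 1])):
            for dz, wz in ((0, w0[:, 2]), (1, w1[:, 2])):
                np.add.at(mesh,
                          ((i0[:, 0] + dx) % ngrid,
                           (i0[:, 1] + dy) % ngrid,
                           (i0[:, 2] + dz) % ngrid),
                          wx * wy * wz)
    return mesh / mesh.mean()            # rho / mean(rho)

def ngp_lookup(mesh, pos, box):
    """Read the mesh value of the cell containing each position (NGP)."""
    ngrid = mesh.shape[0]
    idx = np.floor(pos / box * ngrid).astype(int) % ngrid
    return mesh[idx[:, 0], idx[:, 1], idx[:, 2]]

# Illustrative data: random particles, with the first five reused as "haloes".
rng = np.random.default_rng(0)
box, ngrid = 8.0, 8
particles = rng.uniform(0.0, box, size=(1000, 3))
one_plus_delta = cic_density(particles, box, ngrid)
halo_density = ngp_lookup(one_plus_delta, particles[:5], box)
```

The returned `halo_density` array holds one value of $1 + \delta_{\rm dm}$ per halo, which is the quantity binned in the $V_{\max}$-density relation.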

The left panel in Fig. 3 shows the distribution of all haloes, and subhaloes only, in
the $V_{\max}$ vs. local dark matter density plane, where we adopt 500 bins for both
$V_{\max}$ and dark matter density. Massive haloes tend to be located in dense
environments, obeying a power law. Interestingly, there is a halo mass range
suppressed at high densities, indicating that these moderately massive haloes have
merged into more massive ones in such environments. Another remarkable feature is
that there are two branches in the high density regions, indicating two different
groups of haloes residing in the same dark matter environment. The haloes in the BDM
catalogue can be divided into distinct haloes and subhaloes. Thus, we find that
haloes in the low-$V_{\max}$ region are predominantly sub-structures of those with
higher $V_{\max}$ (see right panel of Fig. 3). This is why the haloes in very dense
regions (e.g. $1 + \delta_{\rm dm} > 400$) are located in discrete bins of constant
$1 + \delta_{\rm dm}$ in Fig. 3, as can be more clearly seen in the outliers at high
$1 + \delta_{\rm dm}$ values. This relation shows the probability of finding a halo
with a given $V_{\max}$ in a certain dark matter density environment.

Figure 5. Dark matter density field in the BigMD simulation corresponding to
different cosmic web structures, as indicated in the legend of the panels.

We then keep the local dark matter density of each halo, and assign $V_{\max}$ to the
haloes according to the extracted probability, ignoring their original $V_{\max}$.
The new catalogue has the same $V_{\max}$-density relation as the original one, given
the same $V_{\max}$ and density bins. The re-assignment procedure is equivalent to a
shuffling of the halo $V_{\max}$ in each dark matter density bin.
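The equivalence with a per-bin shuffle can be written down directly. The following is a minimal sketch using only the Python standard library; the function name and the toy bin labels are hypothetical.

```python
import random
from collections import defaultdict

def shuffle_within_density_bins(vmax, density_bin, seed=0):
    """Permute V_max only among haloes that share the same density bin."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for i, b in enumerate(density_bin):
        members[b].append(i)                 # halo indices per density bin
    new_vmax = list(vmax)
    for idx in members.values():
        values = [vmax[i] for i in idx]
        rng.shuffle(values)                  # reshuffle the values of this bin
        for i, v in zip(idx, values):
            new_vmax[i] = v
    return new_vmax

# Illustrative catalogue: V_max in km/s and an integer density-bin label per halo.
vmax = [400.0, 420.0, 450.0, 500.0, 560.0, 650.0, 700.0]
density_bin = [0, 0, 0, 1, 1, 2, 2]
new_vmax = shuffle_within_density_bins(vmax, density_bin)
```

By construction the joint histogram of $V_{\max}$ and density bin is unchanged, while any dependence on other environmental quantities within a bin is erased.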

A clear improvement is found with respect to the previous results given in §4.1, as
can be seen in the power spectra shown in Fig. 4. Nevertheless, the lowest and the
next to largest $V_{\max}$ bins still show systematic deviations of about 10 and 15%
up to $k \sim 0.15\,h\,\mathrm{Mpc}^{-1}$, respectively (and increasingly larger
towards larger $k$). The largest $V_{\max}$ bin even shows a deviation exceeding 30%
at $k \sim 0.15\,h\,\mathrm{Mpc}^{-1}$. This indicates that we still need to include
more information to reach the desired accuracy of ~10%.


4.3 Cosmic web environment 


As the next step, we now include non-local indicators. The tidal field tensor
includes second order non-local information, and its eigenvalues have been used to
make a cosmic web classification (e.g. Hahn et al. 2007; Forero-Romero et al. 2009;
Aragon-Calvo et al. 2010; Hoffman et al. 2012). Since different types of cosmic web
structures can have the same local matter density, our analysis in the previous
section mixed haloes residing in different structures and thus with different biases.
Therefore, we split the haloes according not only to their local dark matter density,
but also to the type of cosmic web structure they live in.

Figure 6. $V_{\max}$-density relation of different cosmic web structure classes. The
mass of knots increases from class 1 to class 3.

In particular, we classify cosmic web structures following Hahn et al. (2007) and
Forero-Romero et al. (2009). From the dark matter density field, we obtain the
gravitational potential $\phi$ from the Poisson equation, and construct the tidal
field tensor

$$T_{ij} = \frac{\partial^2 \phi}{\partial x_i \partial x_j}. \qquad (3)$$

If all three eigenvalues of $T_{ij}$ are above (below) a certain threshold
($\lambda_{\rm th}$), then the local structure is collapsing (expanding) in all
directions, forming a knot (void). When two (one) eigenvalues are above the
threshold, the structure collapses along two (one) directions and we have
filament-like (sheet-like) structures.
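A compact sketch of this classification is given below. It assumes NumPy, a periodic box, and the Poisson equation in the normalised form $\nabla^2 \phi = \delta$, so that in Fourier space $T_{ij}(k) = k_i k_j \delta_k / k^2$. It uses the counting convention of Hahn et al. (2007), in which three, two, one and zero eigenvalues above the threshold mark knots, filaments, sheets and voids. The function names are hypothetical.

```python
import numpy as np

def tidal_eigenvalues(delta, box):
    """Eigenvalues of T_ij = d^2 phi / dx_i dx_j with laplacian(phi) = delta."""
    ngrid = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(ngrid, d=box / ngrid)
    kvec = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kvec[0] ** 2 + kvec[1] ** 2 + kvec[2] ** 2
    k2[0, 0, 0] = 1.0                    # avoid 0/0; the mean mode is removed below
    delta_k = np.fft.fftn(delta)
    delta_k[0, 0, 0] = 0.0
    tidal = np.empty(delta.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            t_ij = np.fft.ifftn(kvec[i] * kvec[j] * delta_k / k2).real
            tidal[..., i, j] = t_ij
            tidal[..., j, i] = t_ij      # the tensor is symmetric
    return np.linalg.eigvalsh(tidal)     # shape (ngrid, ngrid, ngrid, 3)

def classify_web(eigenvalues, lam_th=0.0):
    """Number of eigenvalues above lam_th: 3 knot, 2 filament, 1 sheet, 0 void."""
    return (eigenvalues > lam_th).sum(axis=-1)

# Illustrative field: a single Gaussian overdensity at the centre of the box.
n = 16
x = np.arange(n) - n // 2
r2 = x[:, None, None] ** 2 + x[None, :, None] ** 2 + x[None, None, :] ** 2
delta = np.exp(-r2 / (2.0 * 2.0 ** 2))
eig = tidal_eigenvalues(delta, box=float(n))
web = classify_web(eig, lam_th=0.0)
```

In this toy field the central cell is classified as a knot, since the isotropic peak collapses along all three axes. The threshold `lam_th` plays the role of $\lambda_{\rm th}$ in the text.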

A slice of the dark matter density field for $\lambda_{\rm th} = 0$ is shown in
Fig. 5 for different types of cosmic web structures. Although voids, sheets, and
filaments occupy most of the volume, the majority of the haloes are located in knots.
To be more precise, we further classify knots by their total enclosed mass into
several classes. To compute the mass of each knot, we adopt a simple FoF algorithm
resolving single knots from the dark matter density field. The mesh cells of knots
are marked during the cosmic web classification, and subsequently all marked cells
that are next to each other are merged to construct a single knot. The mass of the
knot is then proportional to the sum of the dark matter mass of all the connected
cells. Our analysis shows that, for the halo masses considered in this study, the
distinction between non-knot structures does not add any information. Therefore, we
combine the rest of the structures into a single class. This situation may however be
different when considering lower mass haloes.
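The merging of adjacent knot cells can be sketched as a connected-component search. The version below is only an illustration: it assumes that "next to each other" means sharing a face (6-connectivity) and that the mesh is periodic, neither of which is specified in the text, and the function name is hypothetical.

```python
from collections import deque

def group_knot_cells(knot_cells, cell_mass, ngrid):
    """Merge face-adjacent knot cells and return a list of (cells, total mass)."""
    unvisited = set(knot_cells)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    knots = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        members = [seed]
        while queue:                               # breadth-first search
            i, j, k = queue.popleft()
            for di, dj, dk in steps:
                nb = ((i + di) % ngrid, (j + dj) % ngrid, (k + dk) % ngrid)
                if nb in unvisited:
                    unvisited.remove(nb)
                    queue.append(nb)
                    members.append(nb)
        knots.append((members, sum(cell_mass[c] for c in members)))
    return knots

# Illustrative input: five marked cells on a 6^3 mesh with arbitrary masses.
cell_mass = {(0, 0, 0): 2.0, (1, 0, 0): 1.0, (5, 0, 0): 0.5,
             (3, 0, 0): 4.0, (5, 5, 5): 1.5}
knots = group_knot_cells(cell_mass.keys(), cell_mass, ngrid=6)
knot_masses = sorted(mass for _, mass in knots)
```

Here the cells (5, 0, 0), (0, 0, 0) and (1, 0, 0) form one knot through the periodic boundary, and the two remaining cells are isolated knots. Binning `knot_masses` would then give the knot classes.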

With the different types of cosmic web structures resolved from the dark matter
density field, we can then extract the relation between the halo $V_{\max}$ and the
dark matter density for each class, as shown in Fig. 6. In this plot, only three
classes of knots are shown to illustrate the various $V_{\max}$-density relations as
a function of the mass of knots. Nevertheless, the total number of classes in the
re-assignment procedure exceeds 500.

© 0000 RAS, MNRAS 000, 000-000

Zhao et al.

Figure 7. Power spectra of the sub-samples with different halo $V_{\max}$. Same
convention as in Fig. 2. CW stands for the cosmic web classification additionally
used in this case. The dotted lines indicate the results of §4.2.

Figure 8. Power spectra of the sub-samples with different halo $V_{\max}$. Same
convention as in Fig. 2. EX stands for the halo-exclusion additionally used in this
case. The dotted lines indicate the results in §4.3.

We build the same number of classes for the catalogue without halo mass, and then
separately assign masses to haloes in different classes. This procedure leads to a
clear improvement, as can be seen in Fig. 7. The power spectra are now compatible
with the true ones up to $k \sim 0.15\,h\,\mathrm{Mpc}^{-1}$ within 5%. Nevertheless,
the more massive haloes with the largest $V_{\max}$ still show an increasing
systematic deviation towards high $k$. The next to the largest mass bin deviates by
~10% at $k = 0.2\,h\,\mathrm{Mpc}^{-1}$, and by more than 20% at
$k = 0.25\,h\,\mathrm{Mpc}^{-1}$, while the largest mass bin already deviates by more
than 20% at $k = 0.1\,h\,\mathrm{Mpc}^{-1}$. Let us therefore focus in the next
section on the most massive haloes.


4.4 Halo-exclusion 

According to the results of the previous section, the information from the mass
function, local density, and cosmic web environment is not enough to accurately
determine the mass of the most massive haloes. A clear systematic deviation is still
present in the power spectra, with a tendency to overestimate the clustering of
massive objects (larger power towards high $k$).

At this stage, we should note that the mass assignment we have conducted, although
depending on the environment, followed a random procedure, thereby ignoring
deviations from the Poisson distribution beyond the ones present in the actual
three-dimensional distribution of haloes. In this sense, we did not distinguish
between distinct haloes and subhaloes. A distinct halo and its subhaloes, sharing the
same dark matter environment, could get the same assigned mass with equal
probability. Thus the re-assigned catalogue does not prevent two massive haloes from
being arbitrarily close to each other, and hence leads to a higher power spectrum. In
reality, however, the halo-exclusion effect affecting massive haloes yields
under-dispersed (dispersion smaller than Poisson) distributions (e.g. Somerville et
al. 2001; Casas-Miranda et al. 2002; Baldauf et al. 2013).

We therefore consider now the minimum separation between massive haloes
$\Delta r_{\min}$. In particular, to separate massive haloes in the $V_{\max}$
assignment procedure, we perform an additional operation by setting a $V_{\max}$
threshold. In order to distribute haloes with $V_{\max}$ above the threshold to
different cells as far as possible, we follow a top-down procedure, beginning with
the highest $V_{\max}$ and continuing towards lower values, randomly selecting
un-occupied cells to ensure that there is only one such halo in each cell. However,
for a relatively low $V_{\max}$ threshold, there might not be available cells for all
the haloes. In this case, we assign two or even more haloes to one cell.
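The top-down procedure can be sketched as follows. This is a simplified illustration using only the Python standard library: it ignores the conditioning on density and cosmic web class that the full assignment applies, and the function name and inputs are hypothetical.

```python
import random
from collections import defaultdict

def assign_with_exclusion(halo_cell, high_vmax, seed=0):
    """Give each V_max above the threshold to a halo, one per cell while possible.

    halo_cell[i] is the cell label of halo i; high_vmax lists the values above
    the threshold. Returns a dict mapping halo index to its assigned V_max.
    """
    if len(high_vmax) > len(halo_cell):
        raise ValueError("more masses than haloes")
    rng = random.Random(seed)
    free = defaultdict(list)                 # cell -> haloes still without a mass
    for i, cell in enumerate(halo_cell):
        free[cell].append(i)
    fresh = list(free)                       # cells with no massive halo yet
    rng.shuffle(fresh)
    used = []
    assigned = {}
    for v in sorted(high_vmax, reverse=True):        # highest V_max first
        if fresh:
            cell = fresh.pop()                       # random un-occupied cell
            used.append(cell)
        else:                                        # no empty cell left: reuse one
            cell = rng.choice([c for c in used if free[c]])
        halo = free[cell].pop(rng.randrange(len(free[cell])))
        assigned[halo] = v
    return assigned

# Illustrative input: six haloes in three cells, V_max values in km/s.
halo_cell = [0, 0, 1, 2, 2, 2]
three = assign_with_exclusion(halo_cell, [900.0, 700.0, 800.0])
four = assign_with_exclusion(halo_cell, [900.0, 700.0, 800.0, 650.0])
```

With three values every cell receives exactly one massive halo. With four, the fallback branch places the lowest value in a cell that is already occupied, mirroring the last sentence of the paragraph above.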

This procedure can be refined using the distribution of separations between haloes
for different mass bins extracted from N-body simulations and applied in a stochastic
way, as we do with the $V_{\max}$-density relation. We leave such a study for future
work. Nevertheless, the procedure described above serves to test our hypothesis and
already leads to clear improvements. In particular, we find that most of the mass
($V_{\max}$) bins show power spectra which are compatible with the true ones within
about 5%, up to $k \sim 0.2\,h\,\mathrm{Mpc}^{-1}$. Only the most massive bin still
shows a clear systematic deviation, which has been reduced to less than 15% up to
$k \sim 0.15\,h\,\mathrm{Mpc}^{-1}$, indicating that we still need to increase the
halo-exclusion effect for this mass bin (see Fig. 8).

As a further demonstration of the halo-exclusion correction performed in this
section, we investigate the one-dimensional probability distribution function (PDF)
of haloes with $V_{\max}$ above a threshold of 500 km s$^{-1}$, as shown in Fig. 9.
Here we can see that the PDF matches the true one only after applying the correction.
We know from previous studies that an accurate PDF is especially important regarding
the higher order statistics (e.g. Kitaura et al. 2014b). An analogous analysis has
been conducted with FoF haloes and the results are shown in Appendix A1.

Let us now have a look at applications of this method and its performance in terms of
2- and 3-point statistics in the next section, computed with the ntropy-npoint
software, which is an exact n-point calculator using a kd-tree framework with true
parallel capability and enhanced routine performance (Gardner et al. 2007; McBride et
al. 2011).

Figure 9. Probability distribution function of haloes with $V_{\max}$ above
500 km s$^{-1}$.

Figure 10. Comparison of power spectra for the mock catalogues before $V_{\max}$
assignment.


5 APPLICATION: MOCK CATALOGUES 


In the previous section, we have developed a prescription which allows us to
accurately (within 15% up to $k \sim 0.15\,h\,\mathrm{Mpc}^{-1}$) describe the
complex halo mass-environment dependence. We dub this method the Halo mAss
Distribution ReconstructiON code (HADRON). As an application of our method, we study
in this section the assignment of masses to mock halo catalogues constructed with
perturbation theory. In particular, we consider two methods, the PerturbAtion Theory
Catalogue generator of Halo and galaxY distributions (PATCHY) (Kitaura et al.
2014a,b) and the Effective Zel'dovich approximation mocks (EZmocks) (Chuang et al.
2015). Both methods provide mock halo catalogues calibrated with the BigMD BDM and
FoF halo catalogues, as well as the dark matter particle distributions. While PATCHY
includes an explicit Eulerian nonlinear and stochastic bias description, EZmocks uses
effective modifications of the initial conditions and bias modelling to reproduce the
bias of objects in the final catalogue. The dark matter density field used in EZmocks
is given by the Zel'dovich approximation (Zel'dovich 1970), while PATCHY uses ALPT
(Kitaura & Heß 2013). The different approximations have an impact on the accuracy of
the bias, as we will show below.


5.1 The HADRON-code 

We outline below the steps included in the HADRON-code to assign masses to haloes
constructed with approximate gravity solver based mock generators.

(i) First, we compute the density field and cosmic web structures (knots, sheets,
filaments, and voids) according to the dark matter particles from a reference N-body
simulation. Then, we further classify the knots into different classes according to
their enclosed mass (see details in §4.3). As a result, we obtain the density
($\rho_{\rm dm}$) and cosmic web classification type ($t_{\rm cw}$) for each cell.

(ii) Second, we compute the number of haloes in each density and cosmic web
classification type bin according to the halo catalogue from the reference N-body
simulation.

(iii) Third, we take the dark matter particles according to the approximate catalogue
mock generator, and compute the density and cosmic web type in an analogous way to
step (i). Since these quantities are different for simulations and mocks, we rank
order those from mocks, in order to have an equivalent population of haloes in each
$\rho_{\rm dm}$ and $t_{\rm cw}$ bin.

(iv) Fourth, we apply halo-exclusion to the halo catalogue from the mock generator,
i.e. we assign mass above the threshold to haloes (see details in §4.4).

(v) Finally, we assign the mass to the rest of the haloes. For each mock halo without
mass (some of them have already acquired a mass in step (iv)), we find the local
density $\rho_{\rm dm}$ and cosmic web classification type $t_{\rm cw}$. From step
(ii) we have the distribution of halo masses for a given $\rho_{\rm dm}$ and
$t_{\rm cw}$; we then choose the mass of this halo with the probability from the
corresponding distribution.
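Steps (iii) and (v) above can be illustrated with a short sketch. It is not the HADRON implementation: the rank ordering is realised here as a quantile mapping of mock densities onto the reference densities, which is one plausible reading of step (iii), and the bin definition, function names and toy values are all hypothetical.

```python
import random
from collections import defaultdict

def rank_order(mock_values, ref_values):
    """Replace each mock value by the reference value at the same rank (quantile)."""
    ref_sorted = sorted(ref_values)
    order = sorted(range(len(mock_values)), key=lambda i: mock_values[i])
    mapped = [0.0] * len(mock_values)
    for rank, i in enumerate(order):
        j = min(rank * len(ref_sorted) // len(mock_values), len(ref_sorted) - 1)
        mapped[i] = ref_sorted[j]
    return mapped

def draw_masses(mock_bins, ref_bins, ref_vmax, seed=0):
    """Draw a V_max for each mock halo from reference haloes in the same bin.

    Raises IndexError if a mock bin has no reference halo.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for b, v in zip(ref_bins, ref_vmax):
        pool[b].append(v)
    return [rng.choice(pool[b]) for b in mock_bins]

def density_bin(rho):
    """Toy two-bin split of the density axis (an arbitrary illustrative cut)."""
    return 0 if rho < 3.0 else 1

# Reference haloes: density, cosmic web type and V_max (illustrative numbers).
ref_rho, ref_web, ref_vmax = [1.0, 2.0, 4.0, 8.0], [0, 0, 1, 1], [400, 420, 520, 700]
# Mock haloes: densities on a different scale, plus their web types.
mock_rho, mock_web = [0.5, 3.0, 0.9, 2.0], [0, 1, 0, 1]

mapped_rho = rank_order(mock_rho, ref_rho)
ref_bins = [(density_bin(r), t) for r, t in zip(ref_rho, ref_web)]
mock_bins = [(density_bin(r), t) for r, t in zip(mapped_rho, mock_web)]
mock_vmax = draw_masses(mock_bins, ref_bins, ref_vmax)
```

After the mapping the mock densities share the reference distribution, so each mock halo falls into a bin that is populated in the reference catalogue and can draw its $V_{\max}$ from it.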


5.2 Mass assignment in perturbation theory based mocks

We apply the HADRON-code to PATCHY and EZmock based catalogues which have been
generated using initial conditions based on the ones corresponding to the Planck
BigMD simulation but re-sampled to a lower resolution of $960^3$ cells. This reduces
the cosmic variance; however, the population of haloes based on the dark matter field
still has a random component (cf. Kitaura et al. 2014a; Chuang et al. 2015). As in
section 4, we focus on the BDM halo catalogue from BigMD selected by $V_{\max}$.

Fig. shows the power spectra of the entire PATCHY
and EZmock mock halo catalogues compared to the BDM
halo catalogue from BigMD, which serves as the reference
for the calibration of the mock catalogues. An analogous
analysis has been conducted with FoF haloes and the results
are shown in Appendix A2. EZmock uses a cloud-in-cell (CIC)
based population of haloes, whereas PATCHY uses counts-in-cell
in the population step according to the negative binomial
distribution, which models over-dispersion, i.e., the deviation
from Poissonity (Kitaura et al. 2014a). This results in slightly
better agreement in terms of the power spectra for the EZmock
based catalogues when using CIC estimators, as we do here
and as can be seen in Fig. We assign Vmax to these
two mock catalogues using the procedure described in §5.1.
However, as we have mentioned, this procedure still leaves
some degrees of freedom: the definition of the cosmic web
structures (λth) and the Vmax threshold. We calibrate these
degrees of freedom according to the performance (power
spectra of sub-samples) of the catalogues after the Vmax
assignment. In this case, we employ λth = -0.25 and the
Vmax threshold Vth = 500 km s^-1. Moreover, we generate
100 PATCHY realizations with the same initial conditions,
but with different random seeds for constructing the halo
catalogues. This permits us to estimate the error bars for the
2- and 3-point statistics, sharing the same large-scale cosmic
variance, for a consistent comparison to the N-body
simulation. The comparison of power spectra for different Vmax
bins is shown in the left and right panels of Fig. 11 for the
PATCHY and EZmock mock catalogues, respectively.

Figure 11. Power spectra of (left:) PATCHY and (right:) EZmock
BDM mock catalogues in different Vmax bins after Vmax
assignment.
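The error-bar estimate from the set of realizations can be illustrated with a short sketch. It assumes that the power spectra of all realizations have already been measured on a common set of k bins and stacked into one array; the function names `mean_and_sigma` and `within_one_sigma` are hypothetical, and the source does not specify how the scatter was computed beyond the use of 100 realizations.

```python
import numpy as np

def mean_and_sigma(spectra):
    """Mean and 1-sigma scatter per k bin.

    spectra : array of shape (n_realizations, n_k), one measured
              power spectrum per row (assumed layout).
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    # ddof=1 gives the unbiased sample standard deviation.
    sigma = spectra.std(axis=0, ddof=1)
    return mean, sigma

def within_one_sigma(reference, mean, sigma):
    """Boolean mask: is the reference spectrum inside mean +/- sigma?"""
    reference = np.asarray(reference, dtype=float)
    return np.abs(reference - mean) <= sigma
```

Because all realizations share the same initial conditions, the scatter obtained this way reflects only the stochastic halo population step, not the large-scale cosmic variance.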

We find that PATCHY reaches a significantly higher
accuracy in the biased tracers, due to the more precise dark
matter field used within the approach (ALPT vs Zel'dovich).
In particular, the precision reached with PATCHY equals the
one reached with the mass re-assignment tests using the
N-body simulation (see §4.4), whereas the systematic
deviations with EZmock grow to 20% within k < 0.2 h Mpc^-1.
This means that the remarkable performance of EZmock in
terms of the global 2- and 3-point statistics (on large scales),
as shown in several works (Chuang et al. 2015, 2014), suffers
from its crude dark matter density approximation (on
small scales) when combined with the mass assignment scheme
presented in this work. Let us therefore continue our analysis
focused on PATCHY. Figs. 12 and 13 show the great performance
of the 2- and 3-point correlation functions for different Vmax
bins using PATCHY. We see that the deviation found in the
power spectrum for the most massive bin is also apparent on
small scales in the 2-point correlation function (see black
lines in Fig. 12). Nevertheless, the BAO peak is matched
within 1-σ for all mass bins, showing an excellent agreement
down to the smallest scales (~5 h^-1 Mpc), except for the most
massive bin. The 3-point correlation function is essentially
compatible with the N-body simulation for the different mass
bins. It is interesting to observe how the anisotropy increases
towards higher masses, showing that the tracers with larger
mass are less homogeneously distributed across the cosmic
web.

Figure 12. 2-point correlation functions of PATCHY BDM mock
catalogues in different Vmax bins after Vmax assignment.
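For readers who want to reproduce a 2-point correlation function measurement on a small test catalogue, the following is a minimal sketch. It assumes a periodic cubic box and uses the natural estimator DD/RR - 1 with an analytic RR term; the brute-force pair count scales as N^2 and is only meant for illustration. The function name `xi_periodic` is hypothetical, and the estimator actually used for the figures is not stated in the text.

```python
import numpy as np

def xi_periodic(pos, box, r_edges):
    """Natural estimator of xi(r) in a periodic cubic box.

    pos     : (N, 3) positions in [0, box)
    box     : box side length
    r_edges : radial bin edges; the largest edge must be below box / 2
    """
    pos = np.asarray(pos, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    n = len(pos)
    i, j = np.triu_indices(n, k=1)            # every unordered pair once
    d = pos[i] - pos[j]
    d -= box * np.round(d / box)              # minimum-image convention
    r = np.sqrt((d ** 2).sum(axis=1))
    dd, _ = np.histogram(r, bins=r_edges)     # data-data pair counts
    shell = 4.0 / 3.0 * np.pi * (r_edges[1:] ** 3 - r_edges[:-1] ** 3)
    rr = 0.5 * n * (n - 1) * shell / box ** 3 # expected pairs for a uniform field
    return dd / rr - 1.0
```

For a catalogue of realistic size a tree-based or grid-based pair counter would be needed; the logic of the estimator stays the same.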


6 CONCLUSIONS 

In this work, we have studied the probabilistic dependence
of the halo mass distribution on local and non-local
indicators, such as the local density, the cosmic web
environment, and the halo-exclusion effect. We have found
complex non-linear relations between the halo mass and the
local density field, showing a degeneracy between parent
haloes and subhaloes in certain density environments, as
expected. Furthermore, we have used the non-local cosmic web
environment information according to the eigenvalues of the
tidal field tensor. This permits us to find accurate statistical
relations between the halo mass and the density and cosmic
web environment. Such relations can be used to assign
masses to a distribution of haloes. We dubbed the
implementation of our method the HADRON (Halo mAss Distribution
ReconstructiON) code. We first tested this on the halo
distribution of the Planck BigMultiDark simulation by
ignoring the actual information of their masses and
reconstructing them using the statistical relations found in this
work. We furthermore tested this method on halo distributions
produced by perturbation theory based codes, such as PATCHY
and EZmock. Our results show that accurate perturbation
theory models are required to properly model the halo mass
to density relation. In particular, augmented Lagrangian
perturbation theory (ALPT), as opposed to the Zel'dovich
approximation, permits us to dramatically reduce the errors.
We find that the resulting populations of haloes (classified
into different mass bins) using ALPT agree in terms of power
spectra within 1σ up to scales of k = 0.2 h Mpc^-1 for different
mass cuts, demonstrating that we recover the correct
scale-dependent bias on those scales. Only the most massive haloes
(Vmax ≳ 550 km s^-1) show a larger deviation. For these, we
find evidence of the halo-exclusion effect, as a clear
improvement is achieved when assigning those high masses with a
minimum separation. Furthermore, we have computed the
two- and three-point correlation functions, finding an
excellent agreement for arbitrary mass cuts.

Figure 13. 3-point correlation functions of PATCHY BDM mock
catalogues in different Vmax bins after Vmax assignment.
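The cosmic web classification from the eigenvalues of the tidal field tensor can be sketched as below. The sketch assumes the commonly used convention of counting the eigenvalues that exceed a threshold λth (0: void, 1: sheet, 2: filament, 3: knot); the function name `classify_web` and the array layout are hypothetical, and the tidal tensor itself is taken as given.

```python
import numpy as np

WEB_TYPES = ("void", "sheet", "filament", "knot")

def classify_web(tidal_tensors, lambda_th):
    """Classify cells by the number of eigenvalues above lambda_th.

    tidal_tensors : array of shape (..., 3, 3), symmetric tensors
    lambda_th     : eigenvalue threshold
    Returns an integer array of shape (...) with values 0..3 that
    index into WEB_TYPES.
    """
    tidal_tensors = np.asarray(tidal_tensors, dtype=float)
    eig = np.linalg.eigvalsh(tidal_tensors)   # eigenvalues of symmetric matrices
    return (eig > lambda_th).sum(axis=-1)
```

For example, `WEB_TYPES[classify_web(np.diag([1.0, 1.0, -1.0]), 0.0)]` evaluates to `"filament"` under this convention.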

This method can be applied to the efficient mass production
of mock halo or galaxy catalogues. Our work represents
a quantitative application of the cosmic web classification. It
can have further interesting applications in the multi-tracer
analysis of the large-scale structure for future galaxy
surveys.
APPENDIX A: FOF HALO CATALOGUES
SELECTED BY MASS

This appendix shows the analysis performed using FoF
haloes in an analogous way to the BDM halo analysis shown
in the main text (see §4 and §5).

A1 N-body simulation based catalogues

When applying the HADRON method to FoF catalogues, we
have to adopt mass assignment instead of Vmax assignment,
since Vmax is not available for FoF haloes. In this case, the
extracted mass-density relation is shown in Fig. A1.
Figure A4. Power spectra of (left:) PATCHY and (right:) EZmock
FoF mock catalogues in different mass bins after mass
assignment.

Figure A2. Power spectra of re-assigned sub-catalogues with
different halo masses, in comparison to the original FoF mock
catalogues.

Figure A3. Comparison of power spectra for the catalogues
before mass assignment.


We then follow the same steps as for the Vmax assignment,
i.e. classify the cosmic web structures and employ a mass
threshold. To validate the re-assigned masses, we
also divide the catalogues into 8 sub-samples of different halo
masses, i.e. {[<1.17), [1.17-1.33), [1.33-1.54), [1.54-1.84), 
[1.84-2.28), [2.28-3.08), [3.08-4.92), [>4.92]} x 10^^ 
Fig. A2 displays the power spectra of the sub-samples drawn
from the original BigMD FoF halo catalogue and that after
mass re-assignment. The mass threshold for the exclusion 
operation are 2.51 x lO'^® h~^M q for both catalogues. 
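One way to picture the exclusion operation is a greedy selection of the haloes that receive masses above the threshold, such that no two of them lie closer than a minimum separation. The sketch below is an assumption-laden illustration only: the random visiting order, the function name `select_with_exclusion`, the parameter `d_min` and the periodic-box treatment are all hypothetical, and the text does not specify how the exclusion is implemented in HADRON.

```python
import numpy as np

def select_with_exclusion(pos, box, d_min, n_select, rng):
    """Pick up to n_select haloes with mutual separation >= d_min.

    pos      : (N, 3) candidate positions in a periodic box of side `box`
    d_min    : minimum allowed separation between selected haloes
    n_select : number of high-mass slots to fill
    Returns the indices of the selected haloes.
    """
    pos = np.asarray(pos, dtype=float)
    chosen = []
    for idx in rng.permutation(len(pos)):     # visit candidates in random order
        if len(chosen) == n_select:
            break
        if chosen:
            d = pos[chosen] - pos[idx]
            d -= box * np.round(d / box)      # minimum-image convention
            if np.min((d ** 2).sum(axis=1)) < d_min ** 2:
                continue                      # too close to a selected halo
        chosen.append(int(idx))
    return np.array(chosen, dtype=int)
```

The selected haloes would then be given masses above the threshold, while the remaining haloes draw their masses from the part of the distribution below it.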

We note that the performance for FoF is worse than
for the BDM catalogue, reaching deviations of about 10%
already at k ~ 0.15 h Mpc^-1.


A2 Perturbation theory based catalogues 

For the FoF samples, the performance of PATCHY and EZmock
is shown in Fig. A3. With λth = -0.25, and mass
threshold of 2.51 x 10^® h~ Mq for the mass assignment pro¬ 
cedure, the power spectra for different mass bins is shown 
on the left and right panels of Fig. A4 for PATCHY and
EZmock mock catalogues, respectively. The performance for
FoF with PATCHY is worse than for the BDM catalogues. We
have seen in Appendix A1 that this is also true for the N-body
simulation. Nevertheless, another reason is that PATCHY has
been designed to model over-dispersion, which is present in the
BDM catalogues when taking the full population of haloes,
including subhaloes, but not in the FoF catalogue.
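The meaning of over-dispersion can be made concrete with a small numerical sketch. It draws counts-in-cell from a negative binomial distribution with mean `mean` and variance `mean + mean**2 / beta`, which exceeds the Poisson variance (equal to the mean) for any finite `beta`. The parameter name `beta` and the function `sample_counts` are hypothetical choices for this illustration; the exact parametrisation used in PATCHY is not given in this section.

```python
import numpy as np

def sample_counts(mean, beta, size, rng):
    """Draw over-dispersed counts with a given mean.

    Uses numpy's negative_binomial(n, p) with n = beta and
    p = beta / (beta + mean), which yields
        E[N]   = mean
        Var[N] = mean + mean**2 / beta
    The Poisson limit is recovered for beta -> infinity.
    """
    p = beta / (beta + mean)
    return rng.negative_binomial(beta, p, size=size)
```

Comparing the sample variance of these counts with that of `rng.poisson(mean, size)` shows the excess scatter directly.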